About the Provider

Microsoft is a global technology company and AI platform provider that builds large-scale AI systems, research models, and developer tools. Through Microsoft Research and its cloud ecosystem, the company supports open and enterprise AI development across productivity, cloud, and intelligent agent technologies.

Model Quickstart

This section helps you quickly get started with the microsoft/Fara-7B model on the Qubrid AI inferencing platform. To use this model, you need:
  • A valid Qubrid API key
  • Access to the Qubrid inference API
  • Basic knowledge of making API requests in your preferred language
Once authenticated with your API key, you can send inference requests to the microsoft/Fara-7B model and receive responses based on your input prompts. The Python example below sends a single chat completion request and handles both plain JSON and streamed responses.
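import json
from pprint import pprint

import requests

url = "https://platform.qubrid.com/api/v1/qubridai/chat/completions"
headers = {
    # Replace <QUBRID_API_KEY> with your own key.
    "Authorization": "Bearer <QUBRID_API_KEY>",
    "Content-Type": "application/json",
}
data = {
    "model": "microsoft/Fara-7B",
    "messages": [
        {
            "role": "user",
            "content": "Explain quantum computing to a 5 year old.",
        }
    ],
    "temperature": 0.7,
    "max_tokens": 4096,
    "stream": False,  # set to True to receive the answer chunk by chunk
    "top_p": 0.8,
}

# Pass stream= to requests as well, so that chunks are read as they arrive
# instead of being buffered in full when streaming is enabled.
response = requests.post(
    url,
    headers=headers,
    json=data,
    stream=data["stream"],
)

content_type = response.headers.get("Content-Type", "")
if "application/json" in content_type:
    # Non-streaming: the whole completion arrives as one JSON document.
    pprint(response.json())
else:
    # Streaming: each event arrives as a line of the form "data: <json>".
    # Event streams are UTF-8, so set the encoding before decoding lines.
    response.encoding = "utf-8"
    for line in response.iter_lines(decode_unicode=True):
        if not line:
            continue

        if line.startswith("data:"):
            # Strip only the leading prefix; the payload itself may contain "data:".
            payload = line[len("data:"):].strip()

            if payload == "[DONE]":
                break

            try:
                chunk = json.loads(payload)
                pprint(chunk)
            except json.JSONDecodeError:
                print("Raw chunk:", payload)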

Model Overview

Fara 7B is a Computer Use Agent (CUA) model designed to take actions on the web to complete user goals. It operates by understanding browser screenshots, tracking previous actions, and deciding the next action required to move toward a task. The model predicts actions step-by-step instead of generating only text.
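
The observe, decide, act cycle described above can be sketched in Python as follows. This is a minimal illustration and not Fara's actual runtime: `Action`, `run_agent`, `take_screenshot`, `predict_next_action`, `execute`, and the `terminate` action name are all hypothetical, and the model call is replaced by a scripted stub so the example runs without a browser or an API key.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One predicted step, e.g. name="click", args={"x": 120, "y": 300}."""
    name: str
    args: dict = field(default_factory=dict)


def run_agent(
    goal: str,
    take_screenshot: Callable[[], bytes],
    predict_next_action: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Observe the screen, ask the model for one action, apply it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        # The model sees the goal, the current screen, and all prior actions.
        action = predict_next_action(goal, screenshot, history)
        history.append(action)
        if action.name == "terminate":  # hypothetical "task finished" action
            break
        execute(action)
    return history


def scripted_model(goal: str, screenshot: bytes, history: List[Action]) -> Action:
    """Stand-in for the real model: replays a fixed list of actions."""
    script = [
        Action("click", {"x": 120, "y": 300}),
        Action("type", {"text": "weather in Paris"}),
        Action("terminate"),
    ]
    return script[len(history)]


executed: List[Action] = []
history = run_agent(
    goal="Look up the weather in Paris",
    take_screenshot=lambda: b"",  # placeholder for real screenshot bytes
    predict_next_action=scripted_model,
    execute=executed.append,
)
print([a.name for a in history])
```

In a real integration, `predict_next_action` would send the screenshot and history to the model, and `execute` would drive a browser; the loop structure is the part this sketch is meant to show.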

Model at a Glance

| Feature | Details |
| --- | --- |
| Model ID | microsoft/Fara-7B |
| Provider | Microsoft |
| Architecture | Decoder-only Transformer |
| Model Size | 7B parameters |
| Parameters | 4 |
| Context Length | 8192 tokens |
| Training Data | Mixed web, curated instructional datasets, code, and multilingual corpora |
| Dependency Model | Qwen 2.5-VL |

When to use?

You should consider using Fara 7B if:
  • You are building browser-based automation
  • You need an agent that can take actions, not just generate text
  • Your workflow relies on screenshots + action history
  • You want step-by-step execution of user goals

Inference Parameters

| Parameter Name | Type | Default | Description |
| --- | --- | --- | --- |
| Streaming | boolean | true | Enable streaming responses for real-time output. |
| Temperature | number | 0.7 | Controls creativity and randomness; higher values produce more diverse output. |
| Max Tokens | number | 4096 | Maximum number of tokens the model can generate. |
| Top P | number | 1 | Nucleus sampling that restricts token selection to a probability mass threshold. |
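
As a sketch of how these defaults map onto a request body: the field names `stream`, `temperature`, `max_tokens`, and `top_p` are the ones used in the quickstart request, while the helper `build_payload` and its range checks are illustrative assumptions and not part of the Qubrid API.

```python
from typing import Any, Dict

# Defaults as listed for the inference parameters above.
DEFAULTS: Dict[str, Any] = {
    "stream": True,
    "temperature": 0.7,
    "max_tokens": 4096,
    "top_p": 1,
}


def build_payload(prompt: str, **overrides: Any) -> Dict[str, Any]:
    """Merge caller overrides into the defaults and build a request body."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")

    params = {**DEFAULTS, **overrides}

    # Conservative sanity checks (assumed bounds, not documented limits).
    if params["temperature"] < 0:
        raise ValueError("temperature must be non-negative")
    if not 0 < params["top_p"] <= 1:
        raise ValueError("top_p must be in the range (0, 1]")
    if params["max_tokens"] < 1:
        raise ValueError("max_tokens must be at least 1")

    return {
        "model": "microsoft/Fara-7B",
        "messages": [{"role": "user", "content": prompt}],
        **params,
    }


payload = build_payload(
    "Open example.com and find the contact page.",
    stream=False,
    top_p=0.8,
)
print(payload["stream"], payload["temperature"], payload["top_p"])
```

Any parameter that is not overridden keeps its default, so the call above still sends `temperature` 0.7 and `max_tokens` 4096.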

Key Features

  • Computer Use Agent (CUA): Designed to take actions on the web to accomplish high-level user tasks.
  • Multimodal Input: Uses browser screenshots along with text and action history to decide next steps.
  • Step-by-Step Action Execution: Predicts one action at a time with grounded arguments, such as the coordinates of a click.
  • On-Device Execution: Can run locally, which keeps screenshots and task data on the user's device and lowers latency.
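
To make the idea of grounded action arguments concrete, here is a sketch that splits a model response into its reasoning text and a structured action. The response format (free text followed by a JSON object inside `<tool_call>` tags) and the field names `name`, `arguments`, `action`, and `coordinate` are assumptions made for illustration; check the actual output of your deployment before relying on them.

```python
import json
import re
from typing import Optional, Tuple

# Assumed wrapper around the structured action; not confirmed by this page.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def split_response(text: str) -> Tuple[str, Optional[dict]]:
    """Return (reasoning, tool_call); tool_call is None if no call is found."""
    match = TOOL_CALL_RE.search(text)
    if match is None:
        return text.strip(), None
    reasoning = text[: match.start()].strip()
    return reasoning, json.loads(match.group(1))


# Hand-written sample response, not real model output.
sample = (
    "The search box is near the top of the page, so I will click it.\n"
    '<tool_call>{"name": "computer_use", "arguments": '
    '{"action": "left_click", "coordinate": [412, 188]}}</tool_call>'
)

reasoning, call = split_response(sample)
x, y = call["arguments"]["coordinate"]
print(reasoning)
print(call["arguments"]["action"], x, y)
```

Keeping the reasoning and the action separate makes it easy to log the former for debugging while passing only the latter to whatever component drives the browser.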

Execution Constraints

The model stops execution at critical points, including:
  • Entering personal information
  • Completing purchases
  • Making calls
  • Sending emails
  • Submitting applications
  • Signing into accounts
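
A host application might mirror these stop conditions with an explicit confirmation gate before it executes a step. The sketch below is illustrative only: the `CriticalPoint` categories are taken from the list above, but the `gate` function, the `ask_user` callback, and the idea of tagging each step with a category are hypothetical and say nothing about how the model itself decides to stop.

```python
from enum import Enum
from typing import Callable, List, Optional


class CriticalPoint(Enum):
    """Categories of steps that should not proceed without the user."""
    PERSONAL_INFO = "entering personal information"
    PURCHASE = "completing a purchase"
    CALL = "making a call"
    EMAIL = "sending an email"
    APPLICATION = "submitting an application"
    SIGN_IN = "signing into an account"


def gate(
    description: str,
    critical_point: Optional[CriticalPoint],
    ask_user: Callable[[str], bool],
) -> bool:
    """Return True if the step may proceed, asking the user when required."""
    if critical_point is None:
        return True  # ordinary step, no confirmation needed
    question = (
        f"The agent is about to continue with {critical_point.value} "
        f"({description}). Allow it?"
    )
    return ask_user(question)


# Example callback that records each question and always declines.
questions: List[str] = []


def deny_all(question: str) -> bool:
    questions.append(question)
    return False


scroll_allowed = gate("scroll down the results page", None, deny_all)
purchase_allowed = gate("click 'Place order'", CriticalPoint.PURCHASE, deny_all)
print(scroll_allowed, purchase_allowed, len(questions))
```

In an interactive application, `ask_user` would show the question to the person and return their answer instead of declining automatically.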

Summary

Fara 7B is a 7B-parameter Computer Use Agent developed by Microsoft Research.
  • It performs web-based tasks by understanding browser screenshots and text context.
  • The model executes actions step by step using prior action history.
  • Outputs include internal reasoning followed by tool calls for execution.
  • It is designed for automated web workflows with on-device execution support.